HMM-Based Recognition and Adaptation of Persian Children's Speech
Authors
Abstract
Children's speech is more variable than adults' speech, mainly because children have a shorter vocal tract and smaller vocal folds, and this variability lowers speech recognition accuracy (about 54.5% in this work). Adaptation techniques that reduce this variability have therefore been suggested. In this paper we address speech recognition for Persian children using adaptation techniques applied to two models: one trained on children's speech and one trained on adults' speech. We use a speaker normalization method that combines vocal tract length normalization with model adaptation. Experiments show that the adult model performs poorly on its own and that its performance increases by 37% when the adaptation techniques are used. The same techniques increase the recognition rate by 7% when the recognizer is trained on children's speech.
G. Tadayon Tabrizi, S. Setayeshi and M. Molavi Kakhki
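The abstract names vocal tract length normalization but does not say which warping function was used. As a rough illustration only, the sketch below shows one commonly used form, a piecewise-linear frequency warp. The function name `vtln_warp`, the parameters `f_max` and `cutoff_ratio`, and the example warp factors are all assumptions made for this sketch, not details taken from the paper.

```python
def vtln_warp(freq_hz, alpha, f_max=8000.0, cutoff_ratio=0.8):
    """Piecewise-linear frequency warp (illustrative, not the paper's method).

    Below a breakpoint the frequency is scaled by alpha; above it, a second
    linear segment is used so that f_max still maps to f_max.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Breakpoint chosen so the scaled segment never exceeds cutoff_ratio * f_max.
    f_break = cutoff_ratio * f_max * min(1.0, 1.0 / alpha)
    if freq_hz <= f_break:
        return alpha * freq_hz
    # Upper segment joins (f_break, alpha * f_break) to (f_max, f_max).
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (freq_hz - f_break)


# Hypothetical warp factors; in practice a factor is usually chosen per
# speaker, for example by a search that maximizes the model likelihood.
for alpha in (0.9, 1.0, 1.2):
    warped = [round(vtln_warp(f, alpha), 1) for f in (0.0, 1000.0, 4000.0, 8000.0)]
    print(alpha, warped)
```

With `alpha = 1.0` the warp is the identity, and for any positive `alpha` the endpoints 0 and `f_max` stay fixed, so the warped filterbank still covers the same band.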
Similar resources
Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition
The Hidden Markov Model (HMM) is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with either discrete-density or continuous-density modeling. The performance (in correct word recognition rate) of the continuous-density HMM is higher than that of the discrete-density HMM, but its computation complexity is very ...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
Automatic recognition of printed texts and manuscripts is a considerable aid to many applications because it facilitates entering data into the computer and digitizing it. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...